Main
Joshua Goldberg
Data scientist skilled in statistics, machine learning, and software engineering. Proficient in Python, R, SQL, and C++ using functional programming and object-oriented design. Examples of my work include machine learning models to optimize sales and marketing programs, with an estimated impact of $1 million in net revenue annually at a financial institution; models to detect risky behavior across millions of third-party sellers on Amazon.com; and forecasting models to predict demand for 500,000+ products on Amazon’s Marketplace. Outside of work, I enjoy distance running, reading, and developing and implementing algorithms in C++.
Industry Experience
Data Scientist
Amazon
Seattle, WA
Current - 2020
- Conceptualize and implement machine learning models for forecasting and supply-chain use cases; own and maintain Python production code in AWS / Amazon SageMaker that predicts customer demand for 500,000+ Amazon products worldwide
- Built, enhanced, and maintained natural-language machine learning models to detect fraud and abuse in third-party sellers’ communication with customers on Amazon’s Marketplace; models detected and prevented thousands of fraud and abuse cases on a quarterly basis
- Create software tools to monitor machine learning models in production
AVP, Lead Data Scientist
Nuveen
Chicago, IL
2020 - 2017
- Pioneered an end-to-end deep learning time series model for client onboarding, covering both execution and experimental design; the model improved the client journey (client retention, client growth, etc.) with an estimated impact of $1 million in net revenue annually
- Built recommendation engine for 150,000 clients in 50+ products
- Presented model and analysis to executive management; results included model adoption by 100+ salespeople and a significant increase in sales for clients treated by the model
- Conceptualized and created a simulation engine that isolated, detected, and measured the ROI impact of company sales events
Senior Equity Research Associate, Financial Services
Raymond James Financial, Inc.
Chicago, IL
2017 - 2014
- Built company and industry models using finance and statistical techniques, including regression and discounted cash flows (DCF)
Education
STEM Continuing Education
Various Institutions
N/A
Current - 2021
- I actively take STEM courses at different universities to enhance, revisit, or refine my technical skillset. Most of the courses are in computer science or mathematics.
- Harvard: Calculus 2 with Series and Differential Equations; Linear Algebra and Differential Equations
- University of Illinois Urbana-Champaign (UIUC): Calculus 1: First course in Calculus and Analytic Geometry
- Edmonds College: CS I, II, III; courses in C/C++ covering data structures and algorithms and object-oriented design and programming
M.S. in Applied Data Science
University of Chicago
Chicago, IL
2020
- Coursework in statistics, linear algebra, machine learning, and deep learning
B.S. in Accounting and Finance
University of South Florida
Tampa, FL
2013
Selected Code Repositories
Machine learning decision tree and data frame implementation in C++
GitHub
Seattle, WA
2021
- Authored with John Nguyen
Generative adversarial network used to generate musical samples
University of Chicago
Chicago, IL
2020
- Capstone project and paper authored with Terry Wang and Rima Mittal. Supervised by Yuri Balasanov
In my free time, I enjoy working with friends, peers, and colleagues on algorithm designs/implementations. Recently, we built data frame and decision tree classes in C++.
Teaching Experience
I am passionate about teaching and helping others. It brings me joy and satisfaction to teach others new skills.
Python for Data Science
University of Chicago
Remote
Current - 2022
- Instructor
- Covers introductory and advanced Python topics: variables, logical operators, containers, loops, conditionals, comprehensions, functions, object-oriented programming (basics), advanced data analysis and manipulation with NumPy and pandas, model evaluation, parallel computation, and APIs
Various courses
University of Chicago
Remote
Current - 2020
- TA
- Intro statistics, machine learning, time series analysis
MastersTrack Statistics for Machine Learning and Machine Learning Courses
University of Chicago
Remote
2022 - 2020
- Instructor and TA
- Statistics course: topics include simple and multiple regression, logistic regression, hypothesis testing, and variable transformations. Machine learning course: a survey of machine learning algorithms, including kNN, support vector machines, decision trees, random forests, boosted trees, and clustering algorithms
Data Understanding via SQL, Databases, and R
University of Chicago
Remote
2021 - 2020
- Instructor and TA
- Topics include an introduction to databases, MySQL, and R